Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification

Authors

  • Man-Wai Mak
  • Wei Rao
Abstract

Recent research has demonstrated the merit of combining Gaussian mixture models (GMMs) and support vector machines (SVMs) for text-independent speaker verification. However, one unaddressed issue in this GMM–SVM approach is the imbalance between the numbers of speaker-class utterances and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique, namely utterance partitioning with acoustic vector resampling (UP-AVR), to mitigate the data imbalance problem. Briefly, the sequence order of acoustic vectors in an enrollment utterance is first randomized, and the randomized sequence is then partitioned into a number of segments. Each of these segments is used to produce a GMM supervector via MAP adaptation and mean vector concatenation. The randomization and partitioning processes are repeated several times to produce a sufficient number of speaker-class supervectors for training an SVM. Experimental evaluations based on the NIST 2002 and 2004 SRE suggest that UP-AVR can reduce the error rate of GMM–SVM systems.
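The resampling procedure described in the abstract can be sketched as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names, the diagonal-covariance universal background model (UBM), the mean-only relevance MAP update, and the relevance factor of 16 are all assumptions made for the example.

```python
import numpy as np


def map_adapt_supervector(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (assumed UBM).

    frames: (T, D) acoustic vectors; weights: (C,); means, variances: (C, D).
    Returns the adapted means stacked into one supervector of length C * D.
    """
    # Log-likelihood of every frame under every Gaussian component: (T, C)
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )
    # Component posteriors per frame, normalized in a numerically stable way
    log_post = np.log(weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    occupancy = post.sum(axis=0)                      # soft counts, (C,)
    first_order = post.T @ frames                     # (C, D)
    segment_means = first_order / np.maximum(occupancy, 1e-10)[:, None]

    # Interpolate between segment statistics and the prior (UBM) means
    alpha = (occupancy / (occupancy + relevance))[:, None]
    adapted = alpha * segment_means + (1.0 - alpha) * means
    return adapted.reshape(-1)


def up_avr(frames, n_partitions, n_resamplings, ubm, rng):
    """Shuffle the frame order, split into segments, and adapt each segment.

    Returns an array of shape (n_partitions * n_resamplings, C * D).
    """
    supervectors = []
    for _ in range(n_resamplings):
        shuffled = frames[rng.permutation(len(frames))]   # randomize order
        for segment in np.array_split(shuffled, n_partitions):
            supervectors.append(map_adapt_supervector(segment, *ubm))
    return np.stack(supervectors)


# Toy data only: a random 4-component UBM over 3-dimensional features
rng = np.random.default_rng(0)
toy_ubm = (
    np.full(4, 0.25),                 # mixture weights
    rng.normal(size=(4, 3)),          # component means
    np.ones((4, 3)),                  # diagonal variances
)
toy_utterance = rng.normal(size=(200, 3))
speaker_supervectors = up_avr(toy_utterance, 4, 5, toy_ubm, rng)
```

In this sketch a single enrollment utterance yields `n_partitions * n_resamplings` speaker-class supervectors, which is how the procedure increases the number of speaker-class training examples available to the SVM.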

Similar resources

Acoustic vector resampling for GMMSVM-based speaker verification

Using GMM-supervectors as the input to SVM classifiers (namely, GMM-SVM) is one of the promising approaches to text-independent speaker verification. However, one unaddressed issue of this approach is the severe imbalance between the numbers of speaker-class utterances and impostor-class utterances available for training a speaker-dependent SVM. This paper proposes a resampling technique – name...

Addressing the Data-Imbalance Problem in Kernel-Based Speaker Verification via Utterance Partitioning and Speaker Comparison

GMM-SVM has become a promising approach to text-independent speaker verification. However, a problematic issue of this approach is the extremely serious imbalance between the numbers of speaker-class and impostor-class utterances available for training the speaker-dependent SVMs. This data-imbalance problem can be addressed by (1) creating more speaker-class supervectors for SVM training through...

Utterance partitioning with acoustic vector resampling for i-vector based speaker verification

The i-vector has become a state-of-the-art technique for text-independent speaker verification. The major advantage of i-vectors is that they can represent speaker-dependent information in a low-dimensional Euclidean space, which opens up the opportunity to use statistical techniques to suppress session- and channel-variability. This paper investigates the effect of varying the conversation length and the...

Text-Independent Speaker Verification via State Alignment

To model the speech utterance at a finer granularity, this paper presents a novel state-alignment based supervector modeling method for text-independent speaker verification, which takes advantage of the state-alignment method used in hidden Markov model (HMM) based acoustic modeling in speech recognition. In this way, the proposed modeling method can convert a text-independent speaker verification...

GMM kernel by Taylor series for speaker verification

Currently, the approach of combining Gaussian mixture models with support vector machines for the text-independent speaker verification task has produced state-of-the-art performance. Many kernels have been reported for combining GMM and SVM. In this paper, we propose a novel kernel that represents the GMM distribution by the Taylor expansion theorem, and it is regarded as the input of the SVM. The utterance-specifi...

Journal:
  • Speech Communication

Volume 53

Publication date: 2011